🤖 feat: boundary-windowed chat loading + metadata-only workspace activity#2493
Switch startup replay to the latest boundary and add backend paging for older compaction epochs.

- changed AgentSession replay to start at skip=0 (latest compaction boundary)
- added HistoryService.getHistoryBoundaryWindow() to return one older epoch window plus hasOlder
- exposed workspace.history.loadMore in ORPC schema/router/workspace service
- added historyService tests covering boundary-window paging behavior

Signed-off-by: Thomas Kosiewski <tk@coder.com>
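The boundary-window paging described above can be sketched as follows. This is a hedged illustration, not the real `HistoryService` code: the message shape, the `isCompactionBoundary` flag, and the function body are assumptions; only the `skip`/`hasOlder` semantics come from the commit message.

```typescript
// Hypothetical message shape; the real HistoryService reads chat.jsonl entries.
interface HistoryMessage {
  seq: number;
  isCompactionBoundary?: boolean;
  text: string;
}

interface BoundaryWindow {
  messages: HistoryMessage[];
  hasOlder: boolean;
}

// Return the epoch window `skip` boundaries back from the end:
// skip=0 is the latest epoch, skip=1 the one before it, and so on.
function getHistoryBoundaryWindow(history: HistoryMessage[], skip: number): BoundaryWindow {
  // Collect indices of all compaction boundaries, oldest first.
  const boundaries: number[] = [];
  history.forEach((m, i) => {
    if (m.isCompactionBoundary) boundaries.push(i);
  });
  // The window starts at the selected boundary (or the start of history).
  const startBoundary = boundaries.length - 1 - skip;
  const start = startBoundary >= 0 ? boundaries[startBoundary] : 0;
  // The window ends just before the next-newer boundary (or the end of history).
  const end =
    startBoundary + 1 < boundaries.length ? boundaries[startBoundary + 1] : history.length;
  return {
    messages: history.slice(start, end),
    hasOlder: start > 0,
  };
}
```

Each "Load More" click would request one more `skip` step until `hasOlder` turns false.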
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f2200cbcfc
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
@codex review
@codex review Addressed both review comments:
💡 Codex Review
Reviewed commit: 48b6215b01
…ed pagination threshold Change-Id: Ia78e76e48c4466a1f68a2110c511e12e9f44a43c Signed-off-by: Thomas Kosiewski <tk@coder.com>
@codex review Addressed both new review comments:
💡 Codex Review
Reviewed commit: 5f09caa963
@codex review
@codex review
💡 Codex Review
Reviewed commit: 019062e776
@codex review
When switching to a workspace that was streaming in the background, there's a brief window where the aggregator is cleared and replaying history. During this window, trust the activity snapshot for canInterrupt/model/thinkingLevel instead of the empty aggregator state.

This uses the existing transient.caughtUp flag as the guard: only trust the aggregator once the onChat replay has delivered the caught-up marker.

Change-Id: I6d88bb91ced5ba6ce2911ffefdce4b48a6342312
Signed-off-by: Thomas Kosiewski <tk@coder.com>
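The guard described in this commit can be sketched as a simple selector. The shapes and the `selectDisplayState` name are hypothetical; only the rule itself (fall back to the activity snapshot until `caughtUp` is true) comes from the commit message.

```typescript
// Hypothetical shapes; the real state lives in WorkspaceStore and the activity feed.
interface ActivitySnapshot {
  canInterrupt: boolean;
  model: string | undefined;
  thinkingLevel: string | undefined;
}

interface AggregatorState extends ActivitySnapshot {
  caughtUp: boolean; // set once the onChat replay delivers the caught-up marker
}

// While the aggregator is still replaying (caughtUp=false), trust the
// metadata snapshot so the UI doesn't briefly show empty/idle state.
function selectDisplayState(
  aggregator: AggregatorState,
  snapshot: ActivitySnapshot,
): ActivitySnapshot {
  return aggregator.caughtUp ? aggregator : snapshot;
}
```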
@codex review
💡 Codex Review
Reviewed commit: a6687897b9
The setActiveWorkspaceId call was a no-op when the workspace hadn't been registered in the store yet (isWorkspaceRegistered returns false). In integration tests, the WorkspaceContext sync may not have completed by the time setupWorkspaceView runs.

Call addWorkspace(metadata) first to guarantee registration before activation. Also expose addWorkspace on the workspaceStore wrapper for test access.

Change-Id: I972c6ecc5ce02e0954bbcd8a73131be4ba9727e8
Signed-off-by: Thomas Kosiewski <tk@coder.com>
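A minimal sketch of the register-before-activate behavior this fix relies on. The class and method bodies are assumptions for illustration; only the names `addWorkspace`, `setActiveWorkspaceId`, and `isWorkspaceRegistered` and the no-op rule come from the commit message.

```typescript
// Minimal sketch, not the real WorkspaceStore.
class WorkspaceStoreSketch {
  private workspaces = new Map<string, { name: string }>();
  activeWorkspaceId: string | undefined;

  isWorkspaceRegistered(id: string): boolean {
    return this.workspaces.has(id);
  }

  addWorkspace(id: string, metadata: { name: string }): void {
    this.workspaces.set(id, metadata);
  }

  // Activation is a no-op for unregistered workspaces, which is why tests
  // must call addWorkspace(metadata) before setActiveWorkspaceId.
  setActiveWorkspaceId(id: string): void {
    if (!this.isWorkspaceRegistered(id)) return;
    this.activeWorkspaceId = id;
  }
}
```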
@codex review

@codex review Addressed the latest P2 race by dropping queued
Codex Review: Didn't find any major issues. 🚀
Increase Storybook chat story wait tolerances to avoid flaky retries/timeouts on slower CI runners:

- raise waitForChatMessagesLoaded timeout to 25s
- add explicit waitFor timeouts in ModelSelectorPrettyWithGateway
- widen Exec/tooltip wait windows in affected App.chat stories

This reproduces and fixes the intermittent `Test / Storybook` failures seen on PR #2493.

Signed-off-by: Thomas Kosiewski <tk@coder.com>

_Generated with `mux` • Model: `openai:gpt-5.3-codex` • Thinking: `xhigh` • Cost: `$51.56`_

Change-Id: Ia0494edbf9e51c47ea3718a46491338b3de176c4
@codex review Addressed a reproducible Storybook flake seen on this PR by hardening chat story waits/timeouts (local repro + pass with CI-equivalent steps).
Codex Review: Didn't find any major issues. Breezy!
Address repeated Storybook runner flakes by:

- raising the global storybook test timeout (`make test-storybook --testTimeout 30000`)
- removing the brittle `ModeHelpTooltip` play interaction (hover media-query dependent)
- making `ShowCodeView` wait on the actionable code button instead of transcript loaded markers

This keeps interaction coverage for stable paths while unblocking CI from non-deterministic story startup timing.

Signed-off-by: Thomas Kosiewski <tk@coder.com>

_Generated with `mux` • Model: `openai:gpt-5.3-codex` • Thinking: `xhigh` • Cost: `$51.56`_

Change-Id: I43f3b6b8558b9357ccfe2328ae9e657ccf0af146
@codex review Follow-up for the persistent Storybook flake:

Local repro/validation now passes reliably with CI-equivalent steps.
`App/CodeExecution › ShowCodeView` remained non-deterministic in CI and local CI-equivalent runs due to startup timing races. Remove the brittle play interaction while retaining the visual story.

This pairs with the previous Storybook timeout hardening to stabilize the `Test / Storybook` gate for this PR.

Signed-off-by: Thomas Kosiewski <tk@coder.com>

_Generated with `mux` • Model: `openai:gpt-5.3-codex` • Thinking: `xhigh` • Cost: `$51.56`_

Change-Id: I79bdfea3cb18f81a59667e284d654a62b08b83dc
@codex review Final Storybook flake follow-up: removed the remaining non-deterministic play interaction.
Codex Review: Didn't find any major issues. Keep them coming!
Reintroduce the ModeHelpTooltip story play interaction, but make it deterministic for CI/Chromatic by forcing the model help trigger visible before hover and asserting tooltip content.

Also add a stable data-component hook on the model help wrapper so the story can target the intended tooltip trigger without brittle selectors.

Signed-off-by: Thomas Kosiewski <tk@coder.com>

_Generated with `mux` • Model: `openai:gpt-5.3-codex` • Thinking: `xhigh` • Cost: `$52.03`_

Change-Id: Ief635d8d799767d58b9ac78fb3537445166da899
@codex review I restored deterministic coverage for the ModeHelpTooltip story by reintroducing the play interaction with a stable selector hook and forcing the trigger visible in headless CI before hover.
💡 Codex Review
Reviewed commit: 11763b9ccc
Replace the App-backed ModeHelpTooltip story with an isolated tooltip fixture wrapped in TooltipProvider. The story now verifies hover behavior and tooltip content deterministically without depending on full app hydration timing.

Also drop the temporary ChatInput `data-component` test hook since the isolated story no longer needs it.

Signed-off-by: Thomas Kosiewski <tk@coder.com>

_Generated with `mux` • Model: `openai:gpt-5.3-codex` • Thinking: `xhigh` • Cost: `$52.03`_

Change-Id: I0420cc3458b59197f6dcc781d0fa84d1a70f78b6
@codex review I replaced the flaky App-backed ModeHelpTooltip interaction with an isolated TooltipProvider fixture that still validates hover-triggered help content, and reran Storybook interactions locally.
Clarify in the ShowCodeView story that CodeExecutionToolCall automatically switches to the code tab when execution completes without nested tool calls. This keeps the story's intent explicit without reintroducing a flaky interaction step.

Signed-off-by: Thomas Kosiewski <tk@coder.com>

_Generated with `mux` • Model: `openai:gpt-5.3-codex` • Thinking: `xhigh` • Cost: `$52.03`_

Change-Id: I32f5010d3649ff2fc0b74ca5def6d1acfbfe55e0
@codex review Addressed the unresolved ShowCodeView thread by documenting that the story intentionally relies on CodeExecutionToolCall's built-in auto-switch to code view (complete + no nested calls), and resolved the thread.
Codex Review: Didn't find any major issues. You're on a roll.
@codex review
Codex Review: Didn't find any major issues. 👍
@codex review
|
Codex Review: Didn't find any major issues. Already looking forward to the next diff. ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
…#2530)

## Summary

Fixes a regression where the context usage meter disappears after switching back to a workspace that compacted while backgrounded.

## Background

Recent boundary-windowed replay behavior only scans the active compaction epoch for context usage. When the newest message in that epoch is the compaction boundary summary and that summary has no `contextUsage`, the UI shows no usage until later tool/model events arrive.

## Implementation

- `CompactionHandler` now sanitizes compaction stream-end metadata by stripping stale provider metadata while attaching a post-compaction context estimate (`systemMessageTokens + summary output tokens`) as `contextUsage` when available.
- `WorkspaceStore` now checks compaction boundary messages for `contextUsage` before stopping its backwards epoch scan.
- Added targeted regression tests in both `compactionHandler.test.ts` and `WorkspaceStore.test.ts`.

## Risks

Low-to-moderate. The behavior change is scoped to post-compaction metadata and usage display fallback. Tests cover both estimate presence and omission paths.

<details>
<summary>📋 Implementation Plan</summary>

# Fix: Context Usage Meter Disappears on Workspace Switch

## Context / Why

After the boundary-windowed chat loading change (PR #2493) and auto-compaction backend move (PR #2469), the context usage meter disappears when switching between workspaces. The user must wait for one or two agent tool calls before it reappears.

**Root cause:** When idle compaction fires while a workspace is backgrounded, the compaction summary intentionally strips `contextUsage` (to avoid displaying stale pre-compaction values). On switch-back, the frontend replays only the current epoch (post-boundary). If the only message in that epoch is the compaction summary (which has no `contextUsage`), the backward scan returns `undefined` → meter shows empty.

## Evidence

| File | What it tells us |
|---|---|
| `src/node/services/compactionHandler.ts:515-535` | `sanitizeCompactionStreamEndEvent` strips `contextUsage`, `providerMetadata`, `contextProviderMetadata` from the compaction stream-end event |
| `src/node/services/compactionHandler.ts:688-706` | Compaction summary message has `systemMessageTokens` and `usage` (with `outputTokens` = summary size) but no `contextUsage` |
| `src/browser/stores/WorkspaceStore.ts:1690-1706` | Backward scan `break`s at `isDurableCompactionBoundaryMarker` **without** checking the boundary message for `contextUsage` |
| `src/node/services/agentSession.ts:2323-2360` | Backend `seedUsageStateFromHistory` scans the boundary epoch (including boundary msg) for `contextUsage` — currently finds nothing because boundary has none |
| `src/common/utils/messages/compactionBoundary.ts:11-13` | `hasDurableCompactedMarker` accepts `true \| "user" \| "idle"` |
| `src/node/services/compactionMonitor.ts` | Auto-compaction threshold is 70% (`DEFAULT_AUTO_COMPACTION_THRESHOLD = 0.7`), triggered by backend's own `lastUsageState` — completely independent of frontend display |

## Approach: Post-Compaction Context Estimate on Boundary Messages

Instead of stripping `contextUsage` entirely, **replace** it with a computed post-compaction estimate representing the approximate context window size *after* compaction (system prompt + summary).

### Why this is safe from compaction loops

The post-compaction estimate is inherently small:

```
estimate = systemMessageTokens + compactionSummary.outputTokens
         ≈ 5–15% of context window
```

- Auto-compaction triggers at **70%** — the estimate is far below this.
- Backend `seedUsageStateFromHistory` would find this small value on restart → no re-compaction.
- The compaction trigger (`CompactionMonitor`) uses the backend's in-memory `lastUsageState` from actual provider responses, **not** the frontend display.

Even if the estimate were somehow wrong, it cannot cause a backend loop.

### What the user sees

| State | Meter |
|---|---|
| Before compaction | 70% (real) |
| After compaction, before next response | ~10% (estimate — system prompt + summary) |
| After next agent response | Real value from new `contextUsage` |

## Implementation Details (~20 net LoC)

### 1. Backend: Compute estimate in `CompactionHandler`

**File:** `src/node/services/compactionHandler.ts`

In `sanitizeCompactionStreamEndEvent`, instead of stripping `contextUsage`, replace it with a post-compaction estimate:

```typescript
private sanitizeCompactionStreamEndEvent(event: StreamEndEvent): StreamEndEvent {
  const { providerMetadata, contextProviderMetadata, contextUsage, timestamp, ...cleanMetadata } =
    event.metadata;

  // Compute a post-compaction context estimate: system prompt + summary tokens.
  // This gives the frontend a directionally-correct "near empty" reading while
  // preventing stale pre-compaction values from inflating the meter.
  const postCompactionContextEstimate = this.computePostCompactionContextEstimate(
    cleanMetadata.systemMessageTokens,
    cleanMetadata.usage,
  );

  const sanitizedEvent: StreamEndEvent = {
    ...event,
    metadata: {
      ...cleanMetadata,
      ...(postCompactionContextEstimate && { contextUsage: postCompactionContextEstimate }),
    },
  };
  assert(
    sanitizedEvent.metadata.providerMetadata === undefined &&
      sanitizedEvent.metadata.contextProviderMetadata === undefined,
    "Compaction stream-end event must not carry stale provider metadata",
  );
  return sanitizedEvent;
}

/**
 * Approximate context window size after compaction: system prompt + summary.
 * Returns undefined if inputs are missing (graceful fallback to no-data).
 */
private computePostCompactionContextEstimate(
  systemMessageTokens: number | undefined,
  usage: LanguageModelV2Usage | undefined,
): LanguageModelV2Usage | undefined {
  const summaryTokens = usage?.outputTokens;
  if (summaryTokens == null || summaryTokens <= 0) return undefined;
  const systemTokens = systemMessageTokens ?? 0;
  const estimatedInputTokens = systemTokens + summaryTokens;
  return {
    inputTokens: estimatedInputTokens,
    outputTokens: 0,
  };
}
```

### 2. Frontend: Read boundary's `contextUsage` before breaking

**File:** `src/browser/stores/WorkspaceStore.ts`

Modify the backward scan (~line 1693) to check the boundary message for `contextUsage` before breaking:

```typescript
const lastContextUsage = (() => {
  for (let i = messages.length - 1; i >= 0; i--) {
    const msg = messages[i];
    if (isDurableCompactionBoundaryMarker(msg)) {
      // Boundary may carry a post-compaction context estimate.
      // Check before breaking so the meter shows "near-empty" instead of nothing.
      const rawUsage = msg.metadata?.contextUsage;
      if (rawUsage && msg.role === "assistant") {
        const msgModel = msg.metadata?.model ?? model ?? "unknown";
        return createDisplayUsage(rawUsage, msgModel, undefined);
      }
      break;
    }
    if (msg.role === "assistant") {
      if (msg.metadata?.compacted) continue;
      // ... existing contextUsage extraction (unchanged)
    }
  }
  return undefined;
})();
```

### 3. Update assertion message

**File:** `src/node/services/compactionHandler.ts`

The existing assertion (line 527-531) checks `contextUsage === undefined`. Update it to allow the new estimate:

```typescript
assert(
  sanitizedEvent.metadata.providerMetadata === undefined &&
    sanitizedEvent.metadata.contextProviderMetadata === undefined,
  "Compaction stream-end event must not carry stale provider metadata",
);
```

(Remove the `contextUsage === undefined` check from the assertion.)

### 4. Tests

- **CompactionHandler tests:** Verify `sanitizeCompactionStreamEndEvent` produces a post-compaction estimate (not the pre-compaction value) when `systemMessageTokens` and `usage.outputTokens` are available, and produces `undefined` when they're missing.
- **WorkspaceStore tests (or unit test for the scan logic):** Verify the backward scan reads `contextUsage` from a boundary message when no newer messages have it.
- **Integration/manual:** Switch between workspaces after idle compaction → meter should show a small value instead of disappearing.

<details>
<summary>Alternatives considered</summary>

### Alt A: Store `lastContextUsage` in `session-usage.json`

Persist `lastContextUsage` to disk alongside existing session usage data. Clear it on compaction.

- **Pro:** Survives restarts independently of message history.
- **Con:** After compaction, still shows empty (cleared). Adds new persistence field + service changes.
- **Con:** More moving parts, touches both `SessionUsageService` and `WorkspaceStore`.

### Alt B: Include `lastContextUsage` in the `caughtUp` IPC payload

Backend sends its in-memory `lastUsageState` in the `caughtUp` event.

- **Pro:** Always authoritative, no new persistence.
- **Con:** `lastUsageState` may be stale/undefined after compaction. Requires IPC schema change. Frontend still needs fallback logic.

### Alt C: Frontend caches value before clearing aggregator

Before `resetChatStateForReplay` clears the aggregator, snapshot the current `lastContextUsage` and use as fallback.

- **Pro:** Frontend-only change.
- **Con:** Shows stale pre-compaction value (70%) then drops to real value — worse UX. Doesn't survive restart.

### Alt D: Don't strip `contextUsage` from compaction boundary (carry forward original)

- **Con:** This is the infinite loop scenario the user warned about. Backend `seedUsageStateFromHistory` would read the inflated pre-compaction value, seed `lastUsageState` to 70%, and `checkBeforeSend` would immediately trigger another compaction.

</details>
</details>

_Generated with `mux` • Model: `openai:gpt-5.3-codex` • Thinking: `xhigh` • Cost: `$8.29`_
…t totals (#2546)

## Summary

Fixes stale workspace cost totals when switching between workspaces. The frontend fetched persisted session usage (`session-usage.json`) only once during workspace registration (`addWorkspace`), so cost rollups arriving while a workspace was inactive (e.g., sub-agent deletions) were never picked up until a hard refresh.

## Background

PR #2493 introduced boundary-windowed chat loading, where only the active workspace receives a full `onChat` subscription. Non-active workspaces use metadata-only activity feeds. This means `session-usage-delta` events — which carry sub-agent cost rollups — only reach the currently active workspace.

Since `getSessionUsage()` was only called once during `addWorkspace()`, switching back to a workspace that received rollups while inactive would show stale (lower) cost totals. Users reported workspaces dropping from ~$20 to <$5 until `Ctrl+Shift+R` forced a full reload.

## Implementation

- Extracted the inline `getSessionUsage` fetch + repricing logic from `addWorkspace()` into a shared `refreshSessionUsage()` private method.
- Added a per-workspace request version guard so slower/older fetch responses cannot overwrite fresher state during rapid workspace switches.
- `setActiveWorkspaceId()` now calls `refreshSessionUsage()` for the newly active workspace, re-hydrating any cost data that arrived while it was inactive.
- Cleanup: `removeWorkspace()` clears the request-version tracking for deleted workspaces.

## Validation

- New regression tests:
  - Verifies activation triggers a fresh `getSessionUsage` fetch and hydrates `sessionTotal`
  - Verifies stale in-flight responses are dropped when a newer refresh supersedes them
- All existing `WorkspaceStore.test.ts` tests pass
- `make typecheck` passes

_Generated with `mux` • Model: `anthropic:claude-opus-4-6` • Thinking: `xhigh`_
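The request-version guard described in that PR is a common pattern for discarding stale async responses. A minimal sketch, assuming hypothetical names (`SessionUsageCache`, the injected fetch function); only the bump-then-compare rule reflects the PR description:

```typescript
type Usage = { sessionTotal: number };

// Minimal sketch of a per-workspace request-version guard; not the real store.
class SessionUsageCache {
  private versions = new Map<string, number>();
  readonly usage = new Map<string, Usage>();

  constructor(private fetchUsage: (workspaceId: string) => Promise<Usage>) {}

  async refreshSessionUsage(workspaceId: string): Promise<void> {
    // Bump the version before the async fetch; any older in-flight request
    // will see a mismatch when it resolves and drop its result.
    const version = (this.versions.get(workspaceId) ?? 0) + 1;
    this.versions.set(workspaceId, version);
    const result = await this.fetchUsage(workspaceId);
    if (this.versions.get(workspaceId) !== version) return; // superseded, drop
    this.usage.set(workspaceId, result);
  }
}
```

With this guard, rapid workspace switches can fire overlapping refreshes and only the most recently issued one may write state, regardless of resolution order.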
Summary
Overhaul chat subscription architecture to load only the current compaction epoch on startup, scope full transcript streaming to the active workspace, and add cursor-based "Load More" pagination for older history.
Background
Previously, every workspace started a full `onChat` subscription that replayed from the penultimate compaction boundary, meaning all workspaces eagerly loaded two epochs of history regardless of whether they were visible. This caused unnecessary data transfer at startup and steady-state bandwidth waste.

This PR introduces three key changes:

- Startup replay begins at `skip=0` (latest boundary only) instead of `skip=1`
- Only the active workspace keeps a full `onChat` stream; all others use a lightweight metadata-only activity feed for sidebar indicators
- Cursor-based "Load More" pagination for older compaction epochs

Implementation
Backend (`agentSession.ts`, `historyService.ts`, ORPC schemas/router)

- `emitHistoricalEvents()` now calls `getHistoryFromLatestBoundary(workspaceId, 0)`
- New `workspace.history.loadMore` endpoint returns a single older boundary window with cursor-based pagination
- `getHistoryBoundaryWindow()` helper scans boundaries to return exactly one epoch window at a time

Frontend aggregator (`StreamingMessageAggregator.ts`)

- `pruneBeforePenultimateBoundary` → `pruneBeforeLatestBoundary`

Frontend store (`WorkspaceStore.ts`, `WorkspaceContext.tsx`)

- `addWorkspace()` no longer starts `runOnChatSubscription()` — subscription is managed by `ensureActiveOnChatSubscription()`
- Only one `onChat` stream active at a time, switched via `setActiveWorkspaceId()`
- Activity feed (`workspace.activity.list/subscribe`) provides streaming/recency/model fallbacks for non-active workspaces
- Pagination state arrives with `caught-up`, exposed as `hasOlderHistory`/`loadingOlderHistory` in `WorkspaceState`

Frontend UI (`ChatPane.tsx`)

- "Load More" control appears at the top of the transcript when `hasOlderHistory` is true

Validation
- `make typecheck` ✅
- `make lint` ✅
- `make fmt-check` ✅
- `bun test` across 3 targeted test suites (150 tests, 427 assertions) ✅
- `bun test src/node/services/agentSession` (43 tests) ✅

Risks
- Missing or malformed `historySequence` metadata may cause pagination to stop earlier than expected (safe degradation rather than crash).

📋 Implementation Plan
Plan: boundary-windowed chat loading + metadata-only workspace activity
Context / Why
We want chat startup to feel snappier by reducing unnecessary replay volume and decoupling sidebar status from full transcript streams.
Requested outcome: avoid full `workspace.onChat` streams for every workspace; use a metadata-only subscription for cross-workspace status (new activity / stream finished), while keeping full chat streaming focused on the active workspace.

This reduces startup data transfer, avoids replaying large historical tails by default, and preserves user-controlled access to older history.
Evidence
- `src/node/services/agentSession.ts` — `emitHistoricalEvents()` currently calls `historyService.getHistoryFromLatestBoundary(workspaceId, 1)` (penultimate boundary replay) before `caught-up`. `onChat` replay modes already exist (full/since/live) with cursor-based reconnect safety checks.
- `src/node/services/historyService.ts` — `getHistoryFromLatestBoundary(workspaceId, skip)` already supports boundary-window selection (`skip=0` latest, `skip=1` previous, etc.), which can back a Load More flow.
- `src/browser/utils/messages/StreamingMessageAggregator.ts` — prunes older epochs (`pruneBeforePenultimateBoundary`) and includes an explicit TODO for paginated older-history support.
- `src/browser/stores/WorkspaceStore.ts` — `addWorkspace()` immediately starts `runOnChatSubscription()` for every workspace during `syncWorkspaces()`. `runOnChatSubscription()` already uses `since` cursor reconnects via `aggregator.getOnChatCursor()`.
- `src/common/orpc/schemas/api.ts` + `src/node/orpc/router.ts` + `src/common/orpc/schemas/workspace.ts` — already expose `workspace.activity.list` + `workspace.activity.subscribe` with `WorkspaceActivitySnapshot { recency, streaming, lastModel, lastThinkingLevel }`.
chat.jsonlinto epoch files?Recommendation: not in this iteration (keep single
chat.jsonl+ boundary/cursor pagination).Why:
HistoryService.findLastBoundaryByteOffset+readHistoryFromOffset).HistoryServicemutation APIs (appendToHistory,updateHistory,truncateHistory,migrateWorkspaceId) currently treat history as one atomic file.chat.jsonlpaths.~/.mux/sessions/<workspace>/chat.jsonl.If we eventually shard by compaction epoch, what must be added?
updateHistory(historySequence)and delete/truncate operations.Estimated additional product LoC beyond the current plan: ~400–700 LoC (+ substantial test churn).
Recommended approach (A): active
onChat+ boundary pagination + activity metadata feedNet LoC estimate (product code): ~260–360 LoC
1) Scope full `onChat` streaming to the active workspace only

Keep `addWorkspace()` for registration/aggregator creation, but manage exactly one full chat stream (the workspace currently displayed).

Files/symbols:

- `src/browser/stores/WorkspaceStore.ts`: `addWorkspace`, `removeWorkspace`, `syncWorkspaces`; `activeWorkspaceId`, `activeOnChatWorkspaceId`; `setActiveWorkspaceId`, `ensureActiveOnChatSubscription`
- `src/browser/contexts/WorkspaceContext.tsx`: call `workspaceStore.setActiveWorkspaceId(currentWorkspaceId)` in an effect

Defensive points:

- Keep exactly one full `onChat` subscription.
- Keep reconnect safety (`mode: "since"` cursor, `full` fallback) inside `runOnChatSubscription()`.

2) Replay only from the latest compaction boundary on initial load
Switch `onChat` full replay baseline from penultimate boundary (`skip=1`) to latest boundary (`skip=0`).

Files/symbols:

- `src/node/services/agentSession.ts`: `emitHistoricalEvents()` history load call + comments
- `src/browser/utils/messages/StreamingMessageAggregator.ts`

This keeps live behavior aligned with fresh loads: once a new boundary arrives, older epochs are pruned by default.
3) Add explicit “Load More history” API with cursor pagination

Expose a non-stream endpoint that pages older compaction epochs via a stable cursor (not page index).

Files/symbols:

- `src/common/orpc/schemas/api.ts`
- `src/node/orpc/router.ts`
- `src/node/services/workspaceService.ts`
- `src/node/services/historyService.ts`

Recommended request/response shape:
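The concrete shape was not preserved here, so the following is a hedged sketch of what such a contract could look like; the field names (`cursor`, `nextCursor`, `hasOlder`) follow the terms used elsewhere in this plan, while everything else is an assumption rather than the shipped ORPC schema:

```typescript
// Hypothetical loadMore contract sketch; each call returns exactly one
// older compaction-epoch window, per the plan above.
interface LoadMoreRequest {
  workspaceId: string;
  cursor: string | null; // opaque cursor from the previous response, null for the first page
}

interface LoadMoreResponse<M> {
  messages: M[]; // one older boundary window, in chronological order
  nextCursor: string | null; // null once the oldest epoch has been returned
  hasOlder: boolean;
}

// Client side: prepend the older window to the transcript, keeping order.
function prependOlderWindow<M>(transcript: M[], response: LoadMoreResponse<M>): M[] {
  return [...response.messages, ...transcript];
}
```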
Why cursor (vs skip index):
4) Implement Load More UX in chat transcript

Add a top-of-transcript control that prepends exactly one older boundary window per click.

Files/symbols:

- `src/browser/stores/WorkspaceStore.ts`: per-workspace `{ nextCursor, hasOlder, loading }` state; `loadOlderHistory(workspaceId)`
- `src/browser/components/ChatPane.tsx`

Implementation note: if `append` cannot safely preserve strict chronological order for prepends, add a dedicated `prependHistoricalMessages()` path in the aggregator and assert sequence monotonicity after merge.

5) Use metadata-only activity feed for non-active workspace status
Leverage existing `workspace.activity.list/subscribe` as the metadata channel for unread + streaming indicators.

Files/symbols:

- `src/browser/stores/WorkspaceStore.ts`: `runActivitySubscription()`; `getWorkspaceState()` / sidebar derivation
- `src/browser/contexts/WorkspaceContext.tsx`

This preserves sidebar responsiveness (unread dot, streaming state, model tooltip) without paying transcript-stream costs for every workspace.
6) Tests to update/add
Files:
- `src/node/services/agentSession*.test.ts` (or add targeted replay test)
- `src/browser/utils/messages/StreamingMessageAggregator.test.ts`
- `src/browser/stores/WorkspaceStore.test.ts`
- `src/browser/components/ChatPane*.test.tsx` (or nearby transcript tests)

Test cases:

- Initial replay starts at the latest boundary (`skip=0`) and emits expected `caught-up` metadata.
- `loadOlderHistory()` advances cursor page-by-page, prepending one boundary window per click until `nextCursor=null` / `hasOlder=false`.
- Non-active workspaces do not open full `onChat` streams.
- Switching workspaces swaps the single active `onChat` subscription and keeps `since` reconnect behavior intact.

Why this approach over sequential full subscriptions?
Sequential full subscriptions reduce startup spikes but still replay transcript data for every workspace. Using one full stream (active workspace) plus metadata-only activity for the rest cuts both startup and steady-state bandwidth much more aggressively while keeping sidebar signal quality.
Validation plan
- `bun run test src/node/services/historyService.test.ts`
- `bun run test src/node/services/agentSession*.test.ts` (targeted replay-focused cases)
- `bun run test src/browser/utils/messages/StreamingMessageAggregator.test.ts`
- `bun run test src/browser/stores/WorkspaceStore.test.ts`
- `bun run test src/browser/components/Messages/MessageRenderer.test.tsx` and/or `ChatPane` tests (if Load More UI is touched there)
- `make typecheck`

Generated with
`mux` • Model: `anthropic:claude-opus-4-6` • Thinking: `xhigh` • Cost: `$0.95`